Estimation of covariance matrices

In statistics, sometimes the covariance matrix of a multivariate random variable is not known but has to be estimated. Estimation of covariance matrices then deals with the question of how to approximate the actual covariance matrix on the basis of a sample from the multivariate distribution. Simple cases, where observations are complete, can be dealt with by using the sample covariance matrix. The sample covariance matrix (SCM) is an unbiased and efficient estimator of the covariance matrix if the space of covariance matrices is viewed as an extrinsic convex cone in R<sup>''p''×''p''</sup>; however, measured using the intrinsic geometry of positive-definite matrices, the SCM is a biased and inefficient estimator. In addition, if the random variable has a normal distribution, the sample covariance matrix has a Wishart distribution, and a slightly differently scaled version of it is the maximum likelihood estimate. Cases involving missing data require deeper considerations. Another issue is robustness to outliers: "Sample covariance matrices are extremely sensitive to outliers".〔''Robust Statistics'', Peter J. Huber, Wiley, 1981 (republished in paperback, 2004)〕〔''Modern Applied Statistics with S'', William N. Venables, Brian D. Ripley, Springer, 2002, ISBN 0-387-95457-0, ISBN 978-0-387-95457-8, page 336〕
Statistical analyses of multivariate data often involve exploratory studies of the way in which the variables change in relation to one another, and this may be followed up by explicit statistical models involving the covariance matrix of the variables. Thus the estimation of covariance matrices directly from observational data plays two roles:
* to provide initial estimates that can be used to study the inter-relationships;
* to provide sample estimates that can be used for model checking.
Estimates of covariance matrices are required at the initial stages of principal component analysis and factor analysis, and are also involved in versions of regression analysis that treat the dependent variables in a data-set, jointly with the independent variable, as the outcome of a random sample.
==Estimation in a general context==
Given a sample consisting of ''n'' independent observations ''x''<sub>1</sub>, ..., ''x''<sub>''n''</sub> of a ''p''-dimensional random vector ''X'' ∈ R<sup>''p''×1</sup> (a ''p''×1 column-vector), an unbiased estimator of the (''p''×''p'') covariance matrix
:\operatorname{cov}(X) = \operatorname{E}\left[\left(X - \operatorname{E}[X]\right)\left(X - \operatorname{E}[X]\right)^\mathrm{T}\right]
is the sample covariance matrix
:\mathbf{Q} = \frac{1}{n-1} \sum_{i=1}^n (x_i - \overline{x})(x_i - \overline{x})^\mathrm{T},
where x_i is the ''i''-th observation of the ''p''-dimensional random vector, and
:\overline{x} = \left( \sum_{i=1}^n x_i \right) / n
is the sample mean.
This is true regardless of the distribution of the random variable ''X'', provided of course that the theoretical means and covariances exist. The reason for the factor ''n'' − 1 rather than ''n'' is essentially the same as the reason for the same factor appearing in unbiased estimates of sample variances and sample covariances, which relates to the fact that the mean is not known and is replaced by the sample mean.
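As a concrete check on these definitions, the following is a minimal sketch in Python with NumPy; the simulated data and all variable names are illustrative only. It computes the sample covariance matrix '''Q''' directly from the formula above and confirms that NumPy's built-in estimator uses the same ''n'' − 1 divisor by default.
<syntaxhighlight lang="python">
# Minimal sketch of the unbiased sample covariance matrix Q.
# Assumptions: NumPy only; the simulated data and names are illustrative.
import numpy as np

rng = np.random.default_rng(0)
n, p = 500, 3                       # n observations of a p-dimensional vector
true_cov = np.array([[2.0, 0.5, 0.0],
                     [0.5, 1.0, 0.3],
                     [0.0, 0.3, 1.5]])
X = rng.multivariate_normal(np.zeros(p), true_cov, size=n)  # shape (n, p)

x_bar = X.mean(axis=0)              # sample mean (a p-vector)
D = X - x_bar                       # deviations from the sample mean
Q = D.T @ D / (n - 1)               # divide by n - 1, not n: unbiased

# NumPy's estimator agrees (rowvar=False: columns are variables).
assert np.allclose(Q, np.cov(X, rowvar=False))
</syntaxhighlight>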
In cases where the distribution of the random variable ''X'' is known to be within a certain family of distributions, other estimates may be derived on the basis of that assumption. A well-known instance is when the random variable ''X'' is normally distributed: in this case the maximum likelihood estimator of the covariance matrix is slightly different from the unbiased estimate, and is given by
:\mathbf{Q}_n = \frac{1}{n} \sum_{i=1}^n (x_i - \overline{x})(x_i - \overline{x})^\mathrm{T}.
A derivation of this result is given below. Clearly, the difference between the unbiased estimator and the maximum likelihood estimator diminishes for large ''n''.
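The relationship can be sketched in a few lines of Python with NumPy (names are illustrative): the maximum likelihood estimate is simply the unbiased estimate rescaled by (''n'' − 1)/''n'', which tends to 1 as ''n'' grows.
<syntaxhighlight lang="python">
# Sketch: Gaussian maximum likelihood estimate vs. the unbiased estimate.
# The only difference is the divisor (n instead of n - 1).
import numpy as np

rng = np.random.default_rng(1)
n, p = 50, 2
X = rng.standard_normal((n, p))
D = X - X.mean(axis=0)

Q_unbiased = D.T @ D / (n - 1)
Q_ml = D.T @ D / n                  # maximum likelihood estimate

assert np.allclose(Q_ml, (n - 1) / n * Q_unbiased)
assert np.allclose(Q_ml, np.cov(X, rowvar=False, bias=True))  # bias=True divides by n
</syntaxhighlight>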
In the general case, the unbiased estimate of the covariance matrix provides an acceptable estimate when the data vectors in the observed data set are all complete: that is, they contain no missing elements. One approach to estimating the covariance matrix is to treat the estimation of each variance or pairwise covariance separately, and to use all the observations for which both variables have valid values. Assuming the missing data are missing at random, this results in an estimate for the covariance matrix that is unbiased. However, for many applications this estimate may not be acceptable because the estimated covariance matrix is not guaranteed to be positive semi-definite. This could lead to estimated correlations having absolute values greater than one, and/or a non-invertible covariance matrix.
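The pairwise approach can be sketched as follows (Python with NumPy; the function name ''pairwise_cov'', the 30% missingness rate, and the data are illustrative assumptions, not a standard API). Each entry of the matrix is estimated from the rows where both of its variables are observed, and the eigenvalues of the result can then be inspected: a negative eigenvalue means the estimate is not positive semi-definite.
<syntaxhighlight lang="python">
# Sketch of pairwise-complete covariance estimation under missing data.
# Entry Q[j, k] uses only the rows where variables j and k are both
# observed; the result need not be positive semi-definite.
import numpy as np

def pairwise_cov(X):
    """X: (n, p) array with np.nan marking missing entries.
    Assumes every pair of variables is jointly observed at least twice."""
    n, p = X.shape
    Q = np.empty((p, p))
    for j in range(p):
        for k in range(p):
            ok = ~np.isnan(X[:, j]) & ~np.isnan(X[:, k])  # rows valid for both
            xj, xk = X[ok, j], X[ok, k]
            Q[j, k] = ((xj - xj.mean()) * (xk - xk.mean())).sum() / (ok.sum() - 1)
    return Q

rng = np.random.default_rng(2)
X = rng.standard_normal((200, 4))
X[rng.random(X.shape) < 0.3] = np.nan   # ~30% of entries missing at random

Q = pairwise_cov(X)
print(np.linalg.eigvalsh(Q))            # eigenvalues can come out negative
</syntaxhighlight>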
When estimating the cross-covariance of a pair of signals that are wide-sense stationary, missing samples do ''not'' need to be random (e.g., sub-sampling by an arbitrary factor is valid).
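A sketch of this situation in Python with NumPy (the lag, the decimation factor of 3, and the helper ''cross_cov'' are illustrative assumptions): every third sample of one signal is deliberately discarded, yet the lag-''k'' cross-covariance estimated from the remaining sample pairs stays close to the estimate from the full signals.
<syntaxhighlight lang="python">
# Sketch: cross-covariance of wide-sense-stationary signals where one
# signal has been deterministically sub-sampled (every 3rd sample dropped).
import numpy as np

def cross_cov(x, y, lag):
    """Lag-`lag` (lag >= 0) cross-covariance estimated from the
    available (non-NaN) sample pairs (x[t], y[t + lag])."""
    a, b = x[:len(x) - lag], y[lag:]
    ok = ~np.isnan(a) & ~np.isnan(b)
    return np.mean((a[ok] - np.nanmean(x)) * (b[ok] - np.nanmean(y)))

rng = np.random.default_rng(3)
t = np.arange(2000)
x = np.sin(0.05 * t) + 0.1 * rng.standard_normal(t.size)
y = np.roll(x, 5)                       # y is x delayed by 5 samples
x_gap = x.copy()
x_gap[::3] = np.nan                     # drop every 3rd sample: not random

print(cross_cov(x, y, 5))               # full-data estimate
print(cross_cov(x_gap, y, 5))           # estimate from remaining pairs: close
</syntaxhighlight>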
